Audiences

Overview

Audiences analyzes the various communities and networks that are discussing scientific papers on Twitter. Our goal is to help authors better understand who is engaging with their work.

Click on the tabs above to view various summaries of the papers analyzed.


  • Number of papers indexed: 1036

  • Total events (tweets and retweets) analyzed: 186,652

  • Total follower bios included in analysis: 249,225,949 (includes overlap)

Citation

These analyses are described in detail in the following paper:

Carlson J, Harris K. {Title}. Journal. 2019. doi:10.1186/s12864-018-5264-y

Click a journal or category to view a catalog of individual reports for the top articles.

Average fraction of users with > h% white nationalist follower homophily
Journal Category Papers Analyzed Median Interdisciplinary Score h=2% h=5% h=10% h=20%
biorxiv animal-behavior-and-cognition 18 0.9 0.09 0.06 0.03 0.01
biorxiv biochemistry 8 0.9 0 0 0 0
biorxiv bioengineering 12 0.94 0 0 0 0
biorxiv bioinformatics 136 0.81 0 0 0 0
biorxiv biophysics 29 0.91 0.01 0 0 0
biorxiv cancer-biology 19 0.91 0.01 0 0 0
biorxiv cell-biology 36 0.9 0.01 0 0 0
biorxiv clinical-trials 3 0.87 0.02 0.01 0 0
biorxiv developmental-biology 15 0.87 0 0 0 0
biorxiv ecology 13 0.87 0.01 0 0 0
biorxiv epidemiology 6 0.89 0.03 0.01 0.01 0
biorxiv evolutionary-biology 80 0.89 0.03 0.02 0.01 0
biorxiv genetics 83 0.9 0.08 0.05 0.03 0.01
NPG genetics 49 0.91 0.04 0.02 0.01 0
biorxiv genomics 171 0.85 0.02 0.01 0.01 0
biorxiv immunology 14 0.9 0 0 0 0
biorxiv microbiology 62 0.89 0 0 0 0
biorxiv molecular-biology 28 0.9 0 0 0 0
biorxiv neuroscience 170 0.85 0.02 0.01 0 0
biorxiv paleontology 2 0.84 0.03 0.01 0.01 0
biorxiv pathology 2 0.95 0.19 0.09 0.01 0
biorxiv pharmacology-and-toxicology 2 0.89 0.01 0 0 0
biorxiv physiology 4 0.91 0.03 0.01 0.01 0
biorxiv plant-biology 25 0.83 0 0 0 0
biorxiv scientific-communication-and-education 27 0.98 0 0 0 0
biorxiv synthetic-biology 9 0.89 0.01 0 0 0
biorxiv systems-biology 13 0.83 0 0 0 0
NA zoology 0 NA NA NA NA NA

Academic demographics

Of the 1036 papers analyzed, our method estimates a higher fraction of scientists in the audience than the Altmetric demographics do for 1035 (99.9%).

According to the Altmetric demographics, 543 of these papers (52%) are tweeted primarily by non-scientist audiences; our method estimates only 55 papers (5%) are primarily tweeted by non-scientist audiences.

These audience demographic comparisons are summarized in the plot to the right. Points are colored according to their bioRxiv category, and the size is relative to the number of tweets/retweets referencing the paper. Click on a point to open the individual report.

t-tests for differences in academic audience fraction based on Altmetric estimates (FDR-adjusted p-values)
bioinformatics biophysics cell-biology evolutionary-biology genetics genomics microbiology molecular-biology neuroscience plant-biology
biophysics 0.0000074 NA NA NA NA NA NA NA NA NA
cell-biology 0.0000114 0.7243686 NA NA NA NA NA NA NA NA
evolutionary-biology 0.6906818 0.0000851 0.0001655 NA NA NA NA NA NA NA
genetics 0.0000000 0.9867335 0.6726961 0.0000000 NA NA NA NA NA NA
genomics 0.5263421 0.0000341 0.0000578 0.8966080 0.0000000 NA NA NA NA NA
microbiology 0.3536944 0.0008431 0.0018045 0.6454238 0.0000059 0.6726961 NA NA NA NA
molecular-biology 0.0326792 0.1155632 0.1954941 0.0846482 0.0526351 0.0773599 0.2053965 NA NA NA
neuroscience 0.0000000 0.0000059 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 NA NA
plant-biology 0.0725999 0.0871328 0.1481373 0.1481373 0.0382611 0.1465633 0.3280698 0.8816018 0e+00 NA
scientific-communication-and-education 0.0008331 0.5093894 0.6997446 0.0041811 0.4081701 0.0027538 0.0218811 0.4169854 1e-07 0.3481334

t-tests for differences in academic audience fraction based on our estimates (FDR-adjusted p-values)
bioinformatics biophysics cell-biology evolutionary-biology genetics genomics microbiology molecular-biology neuroscience plant-biology
biophysics 0.9866242 NA NA NA NA NA NA NA NA NA
cell-biology 0.9862365 0.9862365 NA NA NA NA NA NA NA NA
evolutionary-biology 0.9862365 0.9862365 0.9862365 NA NA NA NA NA NA NA
genetics 0.0658840 0.3102228 0.3722011 0.0597703 NA NA NA NA NA NA
genomics 0.9862365 0.9866242 0.9862365 0.9862365 0.0597703 NA NA NA NA NA
microbiology 0.9862365 0.9862365 0.9862365 0.9862365 0.0597703 0.9862365 NA NA NA NA
molecular-biology 0.9862365 0.9862365 0.9862365 0.9866242 0.2710026 0.9862365 0.9862365 NA NA NA
neuroscience 0.0897880 0.4426868 0.5730702 0.0827869 0.9862365 0.0597703 0.0597703 0.3722011 NA NA
plant-biology 0.9866242 0.9866242 0.9862365 0.9862365 0.3800470 0.9862365 0.9862365 0.9862365 0.5730702 NA
scientific-communication-and-education 0.9862365 0.9862365 0.9862365 0.9862365 0.1884051 0.9862365 0.9862365 0.9862365 0.3004567 0.9862365
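
The FDR adjustment applied to the p-values in the two tables above can be illustrated with a minimal Python sketch of the Benjamini-Hochberg procedure. This is an assumption about the adjustment method (the tables only say "FDR-adjusted"), it is not the project's R code, and the raw p-values below are made up for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest raw p-value down to the smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank of pvalues[i] in ascending order
        # Scale by m / rank, then enforce monotonicity with a running minimum.
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from five pairwise comparisons.
raw = [0.001, 0.02, 0.03, 0.04, 0.5]
adjusted = bh_adjust(raw)
print(adjusted)
```

Because of the running minimum, several comparisons can end up with the same adjusted value, which may be why a single value repeats many times in a table of adjusted p-values.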

Interdisciplinary scores

For each paper, we calculated the cosine similarity between each of the academic audience topics and the most frequently used words in the Wikipedia article corresponding to the research category under which the paper was submitted. We then calculated an interdisciplinary score as one minus the weighted average of these cosine similarity scores, where \(D\) is the set of academic audience topics, the weight \(w_d\) is the fraction of the academic audience associated with topic \(d\), and \(\vec{d}_{home}\) represents the Wikipedia article for the paper's home category:

\(ID_{score} = 1- \sum_{d \in D} w_d \times cos(\vec{d}, \vec{d}_{home})\)

We then normalized these scores to range from 0 to 1. Thus, papers with \(ID_{score} \simeq 1\) have the most interdisciplinary academic audiences, and papers with \(ID_{score} \simeq 0\) have the most domain-specific academic audiences.
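
As a rough illustration of the score described above, here is a minimal Python sketch (not the project's R implementation). It assumes topics and the home category are represented as word-count dictionaries and that the normalization is a min-max rescaling across papers; the source does not specify either detail, and the toy data and function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse word-count vectors (dicts)."""
    dot = sum(count * v.get(word, 0.0) for word, count in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def id_score(topics, home):
    """Raw score: 1 - sum over topics d of w_d * cos(d, d_home).

    topics: list of (weight, word_vector) pairs; weights should sum to 1.
    home:   word vector for the paper's home category.
    """
    return 1.0 - sum(w * cosine(vec, home) for w, vec in topics)


def min_max(scores):
    """Rescale scores to [0, 1] (assumed normalization, not confirmed by the source)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Hypothetical toy data: one home category and two papers.
home = {"gene": 3, "allele": 2, "population": 1}
paper_a = [(0.7, {"gene": 2, "allele": 1}), (0.3, {"population": 1, "gene": 1})]
paper_b = [(0.5, {"neuron": 4, "brain": 2}), (0.5, {"gene": 1, "protein": 3})]

raw = [id_score(paper_a, home), id_score(paper_b, home)]
normalized = min_max(raw)
print(raw, normalized)
```

In this toy example, paper_a's audience topics overlap heavily with the home category, so it receives the lower (more domain-specific) score.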

Lay audience network homophily

Many papers were found to have audience topics aligned with white nationalist rhetoric, reinforcing the qualitative observations made by scientific organizations, science journalists, and scientists themselves. To systematically quantify this trend, for each paper, we calculated the degree of network homophily (i.e., % overlap in followers) between each user and a curated set of prominent white nationalist accounts on Twitter. These plots show the distribution of white nationalist network homophily fraction (\(h\)) for the analyzed papers at four different thresholds (\(h=2\%\), \(h=5\%\), \(h=10\%\), and \(h=20\%\)).

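[Figure: distribution of white nationalist network homophily fraction across the analyzed papers, with one panel each for h=2%, h=5%, h=10%, and h=20%]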
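
To make the thresholds concrete, the following Python sketch computes a per-user homophily value and the fraction of a paper's users above each threshold. It assumes homophily is the share of a user's followers who also follow at least one account in the curated set; the exact definition used by Audiences is not given here, and the follower IDs and function names are hypothetical.

```python
def homophily(user_followers, reference_followers):
    """Fraction of a user's followers who also follow a reference account."""
    if not user_followers:
        return 0.0
    return len(user_followers & reference_followers) / len(user_followers)


def fraction_above(users, reference_followers, threshold):
    """Share of users whose homophily exceeds the threshold (e.g. 0.02 for h=2%)."""
    if not users:
        return 0.0
    hits = sum(
        1
        for followers in users.values()
        if homophily(followers, reference_followers) > threshold
    )
    return hits / len(users)


# Hypothetical follower IDs. reference_followers is the union of the followers
# of the curated accounts; users maps each tweeting user to their followers.
reference_followers = {1, 2, 3, 4}
users = {
    "user_a": {1, 2, 10, 11},                            # 2 of 4 overlap
    "user_b": {20, 21, 22, 23},                          # no overlap
    "user_c": {3, 30, 31, 32, 33, 34, 35, 36, 37, 38},   # 1 of 10 overlap
}

for h in (0.02, 0.05, 0.10, 0.20):
    print(h, fraction_above(users, reference_followers, h))
```

The comparison is strict (greater than h), matching the "> h%" wording used in the summary table.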

About

Background

Audiences is a framework for exploring the various audiences that are engaging with academic publications on Twitter.

Paper metadata and associated Twitter data were collected using APIs from Crossref, Altmetric, Rxivist, and Twitter.

The code for Audiences is written in R, and this site was generated with Hugo, with a modified version of the Mondrian template.

All code used in these analyses is available on GitHub.

Setup

Prerequisites and dependencies

You will need a recent version of RStudio if you wish to use the interactive notebook capabilities.

Audiences requires the following R packages to run:

Twitter API access

Once you have a developer account set up, copy and paste the API keys into config.yaml.

Running Audiences

render_reports.R is a wrapper script to generate the reports for a list of papers. report_template.rmd is an R Markdown-formatted template.

Serving as a webpage

The reports are formatted as interactive HTML documents, making them ideal to share with others on a website. Each report is a self-contained .html file, so you can simply upload it to your own personal website (e.g., if you have a list of your lab’s papers on your website, you can generate a report for each and add a link to the corresponding .html file).

Alternatively, if you have forked the Audiences GitHub repository, you can use GitHub Pages to host the reports.

I am using Hugo with the hugrid template to create a simple static landing page with tiles that link to static/reports/report.html. The website files are hosted in docs, and the GitHub project page is set up to point to this directory.

Setup Twitter API

To reproduce these analyses or run Audiences on your own paper(s), you will first need to set up a Twitter developer account for access to the Twitter API. Documentation for setting up a Twitter dev account is available here. Once completed, copy and paste the app name, consumer keys, and access keys into the appropriate fields of config.yaml.

Generate reports

Running render_reports.R will generate a separate report for each of the papers listed in papers.txt by their Altmetric URLs (one per line). Reports are based on the report_template.rmd R Markdown template.

Output

As each report runs, data scraped from the Twitter API will be cached to article_data/. Subsequent runs will look for the appropriate .rds files in this directory.
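
The caching behavior described above follows a common "load from cache, otherwise fetch and save" pattern. The Python sketch below illustrates that pattern only; Audiences itself is written in R and stores .rds files, whereas this example uses JSON files, a temporary directory, and a hypothetical fake_fetch function standing in for the Twitter API calls.

```python
import json
import tempfile
from pathlib import Path


def load_or_fetch(paper_id, cache_dir, fetch):
    """Return cached data for paper_id if present; otherwise fetch and cache it."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{paper_id}.json"  # the R code uses .rds files instead
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    data = fetch(paper_id)
    cache_file.write_text(json.dumps(data))
    return data


# Hypothetical stand-in for the API request; records how often it is called.
calls = []


def fake_fetch(paper_id):
    calls.append(paper_id)
    return {"paper": paper_id, "tweets": 3}


cache_dir = tempfile.mkdtemp()
first = load_or_fetch("example-paper", cache_dir, fake_fetch)
second = load_or_fetch("example-paper", cache_dir, fake_fetch)
print(first == second, len(calls))
```

The second call is served from the cached file, so the fetch function runs only once per paper.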

Reports will be written to output/reports/ and thumbnail images for each report to output/figures/.


Created with the audiences framework by Jedidiah Carlson
